PROJECT SUMMARY



In the following slides a research project is
presented in which state-of-the-art FT-NIR
instruments were evaluated and utilized, and data
analysis and calibration methodology were
substantially improved. The goal was to allow
rapid NIR analysis of the large number of soybean
samples required for improving the genetics of
soybean seed composition for agricultural cost
savings and human health food applications.



PROJECT SUMMARY, cont'd 



Calibrations are developed for three selected amino acid 
groups that include essential amino acids for identified 
soybean accessions. 

Conventional "wet chemistry" analytical methods are
time-consuming and costly. As a result, soybean breeders and
researchers urgently need faster and less expensive
methods. NIR Spectroscopy is a rapid and
inexpensive method for composition analysis in academia
and industry. Recent advancements in instrumentation
design, such as the application of the Diode Array (DA)
technique and the Fourier Transform (FT) IR and NIR
techniques, have significantly improved overall instrument
performance and advanced the field of grain analysis.



Project Description 



Overall Goals 



•To design improved approaches to NIR 
calibrations for amino acid composition 
analysis of soybean proteins, soybeans and 
soybean food and feed matrix formulations 

•To develop NIR calibration transfer 
methodology between different laboratories for 
various food and feed applications 



Research Objectives 



To produce NIR amino acid calibrations for a small group of 
three amino acids. 

To investigate and develop methodology for NIR spectra pre- 
processing and data analysis to improve both the accuracy 
and reliability of NIR measurements of soybean seed 
composition. 

To develop and optimize NIR Spectroscopy calibrations for 
determination of the above-mentioned amino acids in 
soybean seeds. 



Research Objectives 



To obtain reproducible calibration plots of the
above-mentioned four individual amino acids in soybean lines,
using PLS-1 and PLS-2 regression algorithms.

To perform multivariate analysis to better resolve the 
individual amino acid calibrations. 

To carry out the analysis and comparison of soy 
proteins of similar amino acid composition in powder vs. 
gel vs. liquid suspension, in order to investigate the 
matrix effect on the NIR calibrations. 

To compare reduced and unreduced soybean protein 
calibrations, to improve the cysteine vs. cystine NIR 
calibrations. 



Research Objectives 



1. To compare our results with data from other
laboratories, in order to investigate the
transferability of NIR calibrations among different
laboratories.

2. To compare results with 13C (WALTZ; WAHUHA) NMR
and GC-MS primary data for amino acid composition.
Objectives, cont'd



Evaluate multivariate analysis methodology for NIR 
calibrations in order to determine soybean protein and 
soybean amino acid residue contents (go to wiki, or google, 
etc., and learn some) 

Investigate the potential of NIR spectroscopy for developing
calibrations for amino acids and amino acid mixtures [amino
acid triple matrix method] [specify which amino acids and which
mixtures - specify aa's, but for the mixtures, refer to the table in the next page]

Generate NIR calibrations of the soybean protein amino acid 
residues specified in the table on the next page, based on 
high-resolution nuclear magnetic resonance analysis 



Background 



The soybean: 

• More than 3 billion bushels produced in the US each year
(USDA, 2007)

• Major source of plant protein and oil (and a high-level plant
source of Methionine and Tryptophan)

• Protein content varies greatly among soybean cultivars

• Some have over 50% protein (dry wt.)

• Some accessions show significantly higher Methionine
(~19%*) and Cysteine levels



*Kuiken, 1948 
Research Objectives



Objectives that need to be addressed in order to 
realize NIR's utility in practical food applications and 
lab analysis: 

Generate an NIRS calibration for all of the amino acid residues in 
soybean proteins that are determined by a primary method 

Obtain an NIRS calibration for selected amino acid
triplets, such as Arg-Lys-Glx, Asx-Ala (or Val)-Pro, or
(Met+Cys)-Arg-Val

Can such determinations result in significant savings of time
and money in research and food industry labs?

Obtain appropriate and accurate values for use in precision 
formulation matrices. 






Background 



Soybean Uses 



Main Soybean Growing Countries : 

• United States, Brazil, Argentina, China and India 

Some final products from soybean processing : 

• Foods, Nutraceuticals, soy isoflavones 

• TVP 

• Animal feed 

• Adhesives, Fibers, Lininj 

• Foams 

• Fertilizers 



Usage in Industry 



Developmental Labs and Grain Labs in industry 
have been reluctant to use NIR because of the 
low quality of instruments available (until 
recently) and to an extent, the lack of proper 
calibrations 

It has already been used in the area of new grain 
development, genetic selection and cross- 
breeding. 

Because of its high sensitivity, NIR is useful as a 
rapid and inexpensive screening tool, despite not 
having very high resolution, if a robust and 
accurate calibration can be generated. 



Current Status 



The Food Industry and Nutritional Sciences have a great need for 
rapid techniques that are economical, accurate, reproducible and 
nondestructive. 

Protein Quality is an important processing and nutritional 
attribute 



Soybean protein and amino acids 
- best method 

• The methods that exist for judging protein quality are mostly 
destructive, and possess severe limitations, like changing the 
structure of amino acids before they're quantitated. 

• SS NMR is an established method used to identify the aa
residues, but has resolution limitations. However, NMR can
also be done in liquids or gels, which improves the resolution.

• Drawback: NMR takes much longer than NIR 

• NIR is a powerful secondary technique 
UIUC NIR Soybean Database

• Our high-resolution NIR calibrations and 
methodologies were employed to carry out a large 
number of protein and oil composition analyses of 
soybean seeds (:80,000) for breeding and selection 
purposes, over a period of three years. 

• A wide variety of soybean developmental lines and more than 5,000
exotic soybean germplasm accessions were thus characterized accurately
and reproducibly (Source: UIUC Soybean NIR Database).

• Such results demonstrate the usefulness of this novel 
NIR approach for soybean selection and breeding 
purposes. They also validate our NIR calibrations 
undertaken in parallel with the higher resolution (but 
slower and more expensive) NMR measurements. 



Applications 



•R&D 

• Food Formulations and Protein Quality Amino Acid 
Composition Analysis 

• Food Developments 

• Food Safety and Microbiology Applications 

• Health Foods 

• Nutraceuticals 

• Nutrition Research 

• Agricultural Feeds and Pet Foods 



Biomedical Applications 



High-resolution NIR Chemical Imaging may 
also enable rapid and sensitive analyses with 
micro-arrays for Nucleic Acids, multiple 
Molecular Bioassays, Automated Proteomics, 
Biotechnology, Biomedical & Pharmaceutical 
Applications, such as those aimed at early 
Detection of Cancer and Prevention. 
Previous Studies



Brazil, 2003 to 2006 

•Collected soybean samples in several different states in 
Brazil 

•Analyzed amino acids in soybeans using: 

• HPLC 

• Derivatized HPLC 

•Individual amino acid compositions of the soybean samples 
showed significantly different amino acid mean levels (p<0.01 and 
p<0.02) - Met, Lys, and Thr 

•Tentative Conclusion: it is possible to calculate the amino acid content of
a sample for several amino acids by comparison with a primary
method; however, primary data were not available for several
essential amino acids.



Rationale and Significance 



Both major advancements in instrumentation, and 
improved data analysis/novel calibration 
methodologies are necessary to improve the 
accuracy and reliability of NIR for measuring low- 
level components such as individual amino acids. 



Rationale and Significance 



• High-protein, high-yield cultivars increase the soybean 
crop value. Conventional, or "wet chemistry", methods 
are time-consuming, expensive and impractical for 
repetitive measurements required for genetic selection 
and breeding experiments to increase both protein 
content and the agronomic yield values of soybean 
cultivars 

• Faster and less expensive methods for protein, oil, 
moisture and amino acid analysis of soybeans are 
needed 
Rationale and Significance



Novel NIR instrumentation techniques - combined 
with improved data analysis and calibration 
methodologies - are essential for selecting soybean 
cultivars with both high quality protein composition 
and high agronomic yield. 

Such improved NIR analysis can also result in 
enormous cost and time savings for the amino acid 
composition analysis that is required for example by 
soybean breeders in genetic selection experiments. 



NIRAA Calibration Data 



Flow-Chart of the Steps in a Novel Approach for
Amino Acid NIRS Analysis of Soybeans and Proteins

• Selection of Standard Samples with the widest Range of Equally-Spaced Values (ES-ROV)

• Primary Methods of Analysis for: AA, P, M and O (oil)

• Precision Formulation Matrix for Calculating Concentration Steps

• NIRS Measurements

• Next Group of Amino Acids

• PLS-1

• PLS-2 (n<6: SAA+P+M+O)

• Three AA Composition Results

• Correlation test with Protein %D

• Selection Criteria: High ROV > 20% and Low Correlation: < 90%

• Selection of an Amino Acid Triplet Group for Calibration



Protein Calibration for Bulk Soybean Analysis on 
the Spectrum One NTS FT-NIR Instrument 




65 calibration standards, 20 grams for each standard, 8.9mm NIR beam size 

Source: Soybean NIR Database, UIUC 



Protein Calibration for Bulk Soybean 
Analysis on the DA-7000 Diode Array NIR 
Instrument 




Source: Soybean NIR Database, UIUC 



Calibration Results for Protein and Oil 
Analysis with the Perten DA-7000, Dual 
Diode-Array DA-NIR Instrument 
Source: Soybean NIR Database, UIUC



Moisture Calibration for Bulk Soybean Analysis 
on the DA-7000 Diode Array NIR Instrument 




Source: Soybean NIR Database, UIUC 



Comparison of NIR Dry Soy Protein Data
(Primary data: Sigma Method, Lowry modified)
Amino Acids Highly-Correlated
with the Dry Protein Content:

• Histidine: R = 0.93

• Arginine: R = 0.90

• Glx: R = 0.88

• Valine: R = 0.87

• Leucine: R = 0.8

and the imino acid Proline: R = 0.87
biology.arizona.edu/biochemistry/problem_sets/aa/Arginine.html



Amino Acid Contents of Soybeans Determined by
GCMS and Correlation with Total Dry Protein

% Tot Dry Wt. ARGININE vs. D% Protein



Glutamine plus Glutamic Acid, "GLX", as
Total Dry Weight % vs. Dry Soybean Protein %

DP new% (ZX-50) vs GLX% (HPLC)



Amino Acid Primary Data (GC-MS) 



Amino Acid Primary Data: 

(sample of 401 data points out of a total of 3,618) 
No.   ASX   THR   SER   GLX   PRO   GLY   ALA   VAL   ILE   LEU   TYR   PHE   HIS   LYS   ARG   MET   CYS   Met+Cys
...   ...   ...   2.1   7.8   2.1   1.8   1.8   2.0   1.9   3.2   1.3   2.1   1.1   2.7   3.1   0.7   0.8   1.5
4     4.8   1.6   2.1   7.5   2.0   1.7   1.7   2.0   1.9   3.2   1.3   2.1   1.1   2.6   3.0   0.7   0.7   1.4
5     4.9   1.6   2.1   7.4   2.1   1.8   1.8   2.0   1.9   3.2   1.2   2.1   1.0   2.6   2.8   0.7   0.8   1.5
6     4.9   1.6   2.1   7.6   2.1   1.8   1.8   2.0   1.9   3.2   1.2   2.2   1.1   2.6   3.0   0.7   0.7   1.4
7     5.3   1.8   2.4   8.4   2.4   1.9   1.9   2.1   2.0   3.4   1.5   2.2   1.2   2.8   3.5   0.6   0.8   1.5
8     5.0   1.6   2.1   7.8   2.2   1.8   1.8   2.1   1.9   3.3   1.4   2.1   1.1   2.7   3.1   0.7   0.8   1.5
9     5.1   1.7   2.3   7.9   2.2   1.9   1.9   2.1   2.0   3.3   1.3   2.2   1.1   2.7   3.1   0.7   0.9   1.7
10    5.7   1.4   2.7   9.1   2.3   2.1   2.0   2.2   2.2   3.6   1.6   2.4   1.2   3.0   3.7   0.8   1.0   1.9
11    6.2   1.6   2.9   9.7   2.5   2.2   2.6   2.3   2.2   3.9   1.6   1.8   1.3   3.1   4.1   0.9   1.0   1.9
12    6.1   1.7   2.8   9.5   2.5   2.2   2.1   2.4   2.3   3.8   1.7   2.5   1.4   3.2   4.2   0.9   1.1   1.9
13    5.9   1.5   2.8   9.3   2.4   2.1   2.0   2.3   2.2   3.8   1.6   2.5   1.2   3.1   3.8   0.9   1.1   1.9
15    6.0   1.5   2.8   9.2   2.3   2.1   2.1   2.3   2.2   3.8   1.5   2.4   1.3   3.0   4.1
16    5.7   1.5   2.7   8.8   2.4   2.0   2.0   2.3   2.0   3.6   1.6   2.3   1.2   2.9   4.0
17    6.0   1.5   2.8   8.7   2.3   2.0   2.0   2.2   2.1   3.6   1.6   2.4   1.2   2.9   3.9
18    5.9   1.5   2.7   9.3   2.4   2.1   2.1   2.3   2.2   3.7   1.6   2.4   1.3   3.1   3.9
19    5.7   1.8   2.5   8.9   2.4   2.0   2.0   2.2   2.1   3.6   1.6   2.4   1.2   3.0   3.8
20    5.8   1.4   2.8   9.1   2.6   2.0   2.0   2.2   2.2   3.7   1.6   2.4   1.4   3.0   3.8
21    5.9   1.8   2.6   9.3   2.5   2.1   2.0   2.3   2.2   3.8   1.6   2.2   1.2   3.0   3.8
22    5.7   1.8   2.5   8.9   2.4   2.1   2.0   2.3   2.1   3.7   1.5   2.4   1.2   3.0   3.8
23    5.6   1.7   2.3   8.5   2.2   1.9   1.9   2.1   2.0   3.5   1.5   2.3   1.2   2.8   3.7
24    5.3   1.4   2.6   8.2   2.2   1.9   1.9   2.1   2.0   3.4   1.5   2.2   1.2   2.8   3.5
25    5.4   1.6   2.4   8.4   2.3   1.9   1.9   2.1   2.0   3.4   1.4   2.2   1.1   2.7   3.5



Amino Acid ROVs

AA      ASX    THR    SER    GLX    PRO    GLY    ALA    VAL    ILE
Max     6.20   1.82   2.90   9.67   2.60   2.18   2.57   2.37   2.27
Min     4.73   1.35   2.04   7.23   1.95   1.72   1.72   1.90   1.78
Ratio   0.24   0.26   0.30   0.25   0.25   0.21   0.33   0.20   0.22

AA      LEU    TYR    PHE    HIS    LYS    ARG    MET    CYS    Met+Cys
Max     3.89   1.66   2.53   1.36   3.20   4.67   0.88   1.08   1.95
Min     3.06   1.11   1.54   1.00   2.52   2.73   0.65   0.72   1.38
Ratio   0.21   0.33   0.39   0.26   0.21   0.41   0.27   0.34   0.29

Legend: Red (poor; low), Green (acceptable), Blue (good; high)
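
The Ratio row above appears to be (Max - Min) / Max, and the flow chart names the selection criteria "High ROV > 20%" and "Low Correlation: < 90%". A minimal Python sketch of that screening step follows. It assumes this reading of the ratio, treats both thresholds as strict inequalities, and takes the R values from the "Amino Acids Highly-Correlated" slide. The names `rov_ratio` and `passes_selection` are hypothetical.

```python
# Max/Min values (wt %) copied from the "Amino Acid ROVs" table, and R values
# versus dry protein copied from the "Highly-Correlated" slide.
ROV = {
    "HIS": (1.36, 1.00),
    "ARG": (4.67, 2.73),
    "GLX": (9.67, 7.23),
    "VAL": (2.37, 1.90),
    "PRO": (2.60, 1.95),
}
R_VS_PROTEIN = {"HIS": 0.93, "ARG": 0.90, "GLX": 0.88, "VAL": 0.87, "PRO": 0.87}


def rov_ratio(max_value, min_value):
    """Relative range of values; assumed to be (Max - Min) / Max."""
    return (max_value - min_value) / max_value


def passes_selection(name, min_ratio=0.20, max_r=0.90):
    """Apply the two criteria: wide range and low correlation with protein."""
    ratio = rov_ratio(*ROV[name])
    return ratio > min_ratio and R_VS_PROTEIN[name] < max_r


selected = [name for name in ROV if passes_selection(name)]
print(selected)
```

With strict thresholds, an amino acid sitting exactly on a limit (for example R = 0.90) is rejected; whether the original criteria were meant to be inclusive is not stated in the slides.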









Comparison Between Amino Acid Contents of Soybean
Seeds Determined by Primary Data:
13C Liquid State HR NMR and IEC

Wt. % total   Ala   Val   Leu   Ile   Gly   Asn   Asp   Asx    Gln   Glu   Glx
NMR           5     5     7     4.5   4           5            11    8     19
IEC           4     5.2   7.3   4.7   2.8   ND    ND    12.1   ND    ND    21.3

Wt. % total   Ser   Thr   Arg   Lys   Trp   Tyr   His   Phe   Cys   Met
NMR           5     4     8     7     1     3     3     6     1.5   1
IEC           4.6   3.6   9.5   7.8   ND    3.5   2.6   5.5   ND    0.9





SOY PROTEIN and MOISTURE DATA,
with Soybean Accession Identifiers

Coat color key: v = very, lt = light, grn = green,
brn = brown, bl = black

Accession ID: PI536636, PI548518, PI548477, PI548379, PI548603, PI533655,
PI548311, PI548659, PI567551, 785, PI458057, PI567704, 2482

Coat color: Y, Y, Lgn, Y, Y, Y, Y, Y, ltGn 1/10 blk, DkGn, Y 1/2 brn

Cultivar and seed source: Ripley 93U-5051; Cutler 71 92U-901; Ogden 99S-4040;
Mandarin 00U-109; Perry 98U-1459; Burlison 95U-2238; Capitol 97U-2231;
Braxton 94S-6; Huang li 94U-; 96U-2173; Fu yang (23) 93U-

VIP-Data, 10 data points (example; total: 3,816 data points)

Prot    Moist   DO calc   Oil     DP      DP calc
35.57   11.10   23.06     20.50   40.01   40.01
36.91   10.00   23.07     20.76   41.00   41.01
41.75   10.41   19.04     17.06   46.60   46.60
41.44   10.30   11.17     10.02   46.20   46.20
41.10   10.16   20.70     18.60   45.75   45.75
36.71    9.12   22.67     20.60   40.40   40.39
38.03   10.84   22.70     20.24   42.65   42.65
39.52   10.44   21.59     19.34   44.13   44.13
40.17    9.04   20.67     18.80           44.16
40.67   10.25   20.39     18.30   45.31   45.31
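
The "DP calc" and "DO calc" columns are consistent with the usual conversion from an as-is percentage to a dry-weight percentage, value / (1 - moisture / 100). The short Python sketch below assumes that this is how the columns were obtained; the slides do not state the formula, and the function name `to_dry_basis` is hypothetical.

```python
def to_dry_basis(as_is_percent, moisture_percent):
    """Convert an as-is weight percentage to a dry-weight percentage."""
    if not 0.0 <= moisture_percent < 100.0:
        raise ValueError("moisture must be in [0, 100) percent")
    return as_is_percent / (1.0 - moisture_percent / 100.0)


# First row of the table: Prot = 35.57, Oil = 20.50, Moist = 11.10
dry_protein = to_dry_basis(35.57, 11.10)
dry_oil = to_dry_basis(20.50, 11.10)
print(round(dry_protein, 2), round(dry_oil, 2))
```

Reporting on a dry basis removes the moisture differences between accessions, so that protein and amino acid values can be compared directly.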



Tyrosine vs. %DP

[Scatter plot: DP new% (UIUC) vs TYR, n = 49; x-axis: Protein (42 to 54); y-axis: TYR (1.1 to 1.7); linear fit y = 0.0382x - 0.3778, R^2 = 0.5361]

Tyrosine:

• Aromatic

• Non-polar

• Not as hydrophobic as Phe

• pK1 = 2.2

• pK2 = 9.1


Amino Acid Primary Data (GC-MS), cont'd 





Amino Acid Summary 



• Total protein measurement is by wet chemistry analysis, 
for which a ~%% correlation with our lab's NIR analysis 
is confirmed. 

• Some of the amino acid correlations depend on the type 
of amino acid residue. 

• There is a potential for selecting the composition of 
amino acids seen in soybeans. 



Amino Acids, p. 2

• A few amino acid residues vary much more 
across some soybean varieties than others 

• The combined NIR, NMR and GC-MS data 
for amino acids in soybean seeds shows the 
possibility of generating reliable calibrations 
for selected triplet groups of amino acids 
using FT-NIR Spectroscopy. 

• NIRS data for AA can be validated and may 
thus become a very useful tool for cross- 
breeding and genetic selection purposes. 



3. Proposed Research Methods



I. Techniques, Data Analysis and Expected
Results

Primary and secondary techniques (Chemical and
Hyperspectral NIR Imaging; Fluorescence
Correlation Spectroscopy and Microspectroscopy)

Data Analysis: Principles of NIRS, Data Corrections,
Regression Algorithms and Multivariate Analysis.

II. Limitations and Advantages of proposed
methods

III. Tentative schedule to conduct major steps


I. Techniques, Data analysis and 

Expected Results 

A. TECHNIQUES: 

• NIR Spectroscopy, 

• GC-MS 

• HPLC and IEC 

• Chemical Imaging, 

• Fluorescence Correlation Spectroscopy 

B. Data Analysis 



C. Expected Results 



NIR Analysis of Amino Acids & Proline
in Soybeans

• Wet chemistry analysis:

- Ion Exchange Chromatography (IEC) / UV / vis
Abs. / Fluorescence

- Derivatization HPLC

- GC-MS (USDA - Peoria)

• NMR as Primary Method



Soybean Proteins Content by C-13 NMR 

and Sigma Methods 




Source: Baianu, You, Costescu, Prisecaru & Nelson. AOCS Proc, (2005) 



Principles of NIR Spectroscopy

• IR absorption spectra occur because the atom-to-atom
bonds within molecules can vibrate and rotate, thus
generating series of different energy levels among which
rapid transitions can occur.

• According to Quantum Mechanics, the vibro-rotational
energy levels of a molecule can be approximated by the
following equation:

$$E_{NIR} = E_{rot} + E_{vib} + E_{anh} = j(j+1)Bhc + \left(n + \tfrac{1}{2}\right)\left[1 - x\left(n + \tfrac{1}{2}\right)\right]h\nu$$

• where: j: rotation quantum number = 0, 1, 2, ...;
n: vibration quantum number = 0, 1, 2, ...;
E: energy eigenvalues;
B: rotational constant; h: Planck's constant; c: speed of light;
ν: vibration frequency; and
x: anharmonicity constant (~0.01).
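
To see why overtones of mid-IR vibrations land in the NIR window, the vibrational part of the energy can be evaluated numerically. The Python sketch below assumes the standard anharmonic-oscillator term E_vib = (n + 1/2)[1 - x(n + 1/2)]hν, from which the 0 -> n transition is n[1 - x(n + 1)] times the harmonic wavenumber. The 3000 cm^-1 harmonic wavenumber is a hypothetical, C-H-stretch-like value chosen only for illustration.

```python
def transition_wavenumber(harmonic_wavenumber_cm, n, x=0.01):
    """Wavenumber (cm^-1) of the 0 -> n vibrational transition.

    Follows from E_n - E_0 with E_n = (n + 1/2)[1 - x(n + 1/2)] h*nu.
    """
    return harmonic_wavenumber_cm * n * (1.0 - x * (n + 1))


def to_wavelength_nm(wavenumber_cm):
    """Convert a wavenumber in cm^-1 to a wavelength in nm."""
    return 1.0e7 / wavenumber_cm


HARMONIC_CM = 3000.0  # hypothetical value, not taken from the slides

for n, label in [(1, "fundamental"), (2, "first overtone"), (3, "second overtone")]:
    wavenumber = transition_wavenumber(HARMONIC_CM, n)
    print(label, round(wavenumber, 1), "cm^-1,", round(to_wavelength_nm(wavenumber), 1), "nm")
```

Under these assumptions the fundamental lies in the mid-IR, while the first and second overtones fall inside the roughly 750 to 2500 nm range used by NIR instruments.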



Current Near-Infrared Instruments 
Techniques 



Current NIR instruments utilize EM radiation with
wavelengths from ~750 nm to 2500 nm. Their operation is
based on the fact that molecular bonds stretch and/or bend,
thus causing absorption bands at certain characteristic IR and
NIR wavelengths, with intensities proportional to the amount of the
absorbing components present in the sample, e.g., the amide I and
II bands.



Apparent Absorbance 




• The calculated absorbance is usually referred to as
the "apparent absorbance," and it can be significantly
affected by Specular Reflection and Light Scattering,
even for thin samples.

• Therefore, in order to obtain reliable NIR
quantitation, Spectral Pre-Processing and
Corrections are always required.



Data Correction Problems 



# We found that the NIR methods currently 
employed in industry for: 

• spectral pre-processing 

• correction of light scattering 

• specular reflection effects 

are in need of substantial improvements in order to 
produce high accuracy, robust and stable 
calibrations for rapid composition analyses of seeds. 



Light Scattering Corrections for 
Soybean NIR Spectra 

Spectral variations between soybean samples can be caused by: 

• chemical composition differences 

• (i.e., what you want to measure) 

• Spurious effects* 

• specular reflection 

• scattering effects 

• internal reflection 

* These do not monitor chemical composition -- and are 
therefore measurement artifacts that are undesirable and distort 
the data. 



Fourier Transform NIR 



Recent NIR Spectroscopy techniques utilizing
Fourier Transform (FT) fulfill all these conditions,
but require pre-calibration by AOCS-approved wet
chemistry techniques, using well-defined and stable
sample standard sets of 50 to 100 different samples.



FT-IR Spectrometer Spectrum One and 
FT-NIR Spectrometer SpectrumOne-NTS 




Introduced in 2001 by Perkin 
Elmer Co. (Shelton, CT, USA) 
for High Sensitivity, high- 
resolution and long-term 
stability 

SpectrumOne and Spectrum 

One NTS have a similar look 
but are configured for 
different spectral ranges (e.g., 
IR and NIR, respectively). 




(Perkin Elmer Co., USA) 



Comparison of Soybean Spectra Collected with either 
Perten DA7000 or the PE Spectrum One NTS (with Extd 
InGaAs/NIRA) NIR Instruments 



[Figure: overlaid soybean spectra from the two instruments; x-axis: Wavelength, nm (500 to 2500)]



Lambert-Beer's Law 

• Absorption is a universal spectroscopic phenomenon
that has immediate chemometric applications, because
it is directly related to the constituent concentration
as described by Lambert-Beer's Law:

$$A = \varepsilon \cdot C \cdot L$$

where:

A = True absorbance

ε = Extinction coefficient of the analyte

C = Concentration

L = Path length of light
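
As a small illustration of how this law is used in both directions, the Python sketch below computes an absorbance from a known concentration and then inverts the relation to recover the concentration. The function names and the numerical values (extinction coefficient, concentration, path length) are hypothetical and serve only to show the arithmetic.

```python
def absorbance(extinction_coefficient, concentration, path_length):
    """Lambert-Beer law: A = epsilon * C * L."""
    return extinction_coefficient * concentration * path_length


def concentration_from_absorbance(a, extinction_coefficient, path_length):
    """Invert the law to estimate the concentration from a measured A."""
    return a / (extinction_coefficient * path_length)


# Hypothetical values in consistent units (epsilon * C * L is dimensionless)
EPSILON = 0.8
PATH_LENGTH = 0.5

a = absorbance(EPSILON, 2.0, PATH_LENGTH)
recovered = concentration_from_absorbance(a, EPSILON, PATH_LENGTH)
print(a, recovered)
```

The linearity in C is what makes linear regression methods such as PLS a natural choice for calibration, provided the measured (apparent) absorbance is first corrected for scattering and reflection.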




Lambert-Beer's Law, Con't 



The absorbance of a sample is difficult to measure directly 

In practice, the absorption is often calculated indirectly from the 
measurement of the reflectance (A = Log 1/R), or transmittance (A 
= Log 1 /T), that can be readily measured even for thick samples, 
provided these do not possess a composite structure, such as thick, 
multiple layers with different compositions 

The calculated absorbance is usually referred to as the 'apparent 
absorbance, ' and it can be significantly affected by Specular reflection 
and light scattering even for thin (e.g., 5 mm) samples. 

Therefore, in order to obtain reliable NIR quantitation, spectral pre- 
processing and corrections are always required 
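
The conversion from a measured reflectance or transmittance to an apparent absorbance is a one-line calculation. The Python sketch below assumes R and T are expressed as fractions between 0 and 1 relative to a reference standard; the function name `apparent_absorbance` is hypothetical.

```python
import math


def apparent_absorbance(fraction):
    """Apparent absorbance A = log10(1 / R), or log10(1 / T).

    `fraction` is the reflectance R or transmittance T as a value in (0, 1].
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("reflectance/transmittance must be in (0, 1]")
    return math.log10(1.0 / fraction)


# A sample reflecting 10% of the incident light has an apparent absorbance of 1
for r in (1.0, 0.5, 0.1):
    print(r, round(apparent_absorbance(r), 4))
```

Because scattering and specular reflection also change R, this value is only "apparent": it mixes chemical absorption with physical effects, which is why the corrections discussed next are needed.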



Light Scattering Corrections for Soybean 

NIR Spectra 



Spectral variations between soybean samples can be caused by 
chemical composition differences, specular reflection, as well as 
light scattering effects ( that do not monitor chemical composition 
— and are therefore a measurement artifact) 

The effects of light scattering and/or specular reflection on the
NIR spectra of soybeans need to be investigated, and eliminated if at
all possible



NIR Light Scattering Corrections, p. 2



One finds that the methods currently employed by the
NIR industry for spectra pre-processing and corrections of light
scattering and/or specular reflection effects are in need of
substantial improvements in order to produce calibrations that
are:

• Highly accurate

• Robust

• Stable, for rapid composition analyses of seeds



SpectrumOne NTS Spectra of Bulk Soybean Samples, before (A) and
after (B) Multiple Scattering Correction (MSC)

[Figure: panel A, before: raw data; panel B, after MSC]

Source: UIUC Soybean NIR Database, 2007
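
For readers unfamiliar with MSC, the following Python sketch shows the textbook form of the correction (often called multiplicative scatter correction): each spectrum is regressed on the mean spectrum, and the fitted offset and slope are then removed. This is an illustration of the general technique under that assumption, not necessarily the exact procedure used to produce the figure; all function names and the synthetic spectra are hypothetical.

```python
def mean_spectrum(spectra):
    """Point-by-point average of a list of equally sampled spectra."""
    count = len(spectra)
    return [sum(column) / count for column in zip(*spectra)]


def fit_line(x, y):
    """Least-squares slope and intercept of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def msc(spectra):
    """Remove additive and multiplicative scatter relative to the mean spectrum."""
    reference = mean_spectrum(spectra)
    corrected = []
    for spectrum in spectra:
        slope, intercept = fit_line(reference, spectrum)
        corrected.append([(value - intercept) / slope for value in spectrum])
    return corrected


# Synthetic example: one underlying band shape with different offsets and gains
base = [0.2, 0.5, 0.9, 0.4, 0.3]
raw = [[gain * v + offset for v in base]
       for gain, offset in [(1.0, 0.0), (1.3, 0.1), (0.8, -0.05)]]
corrected = msc(raw)
print([round(v, 4) for v in corrected[1]])
```

In this idealized case the three corrected spectra coincide, because they differ only by scatter-like offset and gain; with real samples the remaining differences are the chemical information that the calibration uses.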



Detrimental Effects of Light Scattering on the
Accuracy of NIR Analysis for Whole Soybeans

(measured with the FT-NIR, SpectrumOne NTS Spectrometer)

Component   R (No MSC)   R (MSC)   SECV (No MSC)   SECV (MSC)
Protein     99.5         99.9      0.63            0.26
Oil         99.3         99.9      0.29            0.13
Moisture    99.8         99.9      0.26            0.17

R: Correlation coefficient

SECV: Standard Error of Cross Validation

(Tested with 65 bulk whole soybean standards)

Source: T. You, 2006



FT-NIR Spectra of Five Major
Soybean Components

Collected on the Perkin-Elmer SpectrumOne NTS FT-NIR Spectrometer

[Figure: spectra of the five components; x-axis: Wavelength, nm (800 to 2400)]

[Source: T. You, 2006]



SpectrumOne NTS FT-NIR Spectra of Soy Protein Isolates (SPI) in H2O,
before (A) and after (B) Multiple Scattering Correction (MSC)

(Source: I.C. Baianu et al., 2009)

Component   Factors (No MSC)   Factors (MSC)   R (No MSC)   R (MSC)   SECV (No MSC)   SECV (MSC)
SPI         6                  6               0.998        0.999     0.87            0.54
H2O         6                  6               0.998        0.999     0.87            0.54

R: Correlation coefficient

SECV: Standard Error of Cross Validation

Source: T. You, 2006



SECV

The Standard Error of Cross Validation (SECV) is defined as:

$$SECV = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$$

where n is the total number of samples, y_i denotes the standard
value of the component concentration y, and \hat{y}_i denotes the
predicted component concentration.
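
A minimal Python sketch of this statistic is given below. It assumes the root-mean-square form with a 1/n normalization, where the predictions come from cross validation (each sample predicted by a model built without it). Some software divides by n - 1 instead. The function name `secv` and the example numbers are hypothetical.

```python
import math


def secv(reference, predicted):
    """Root-mean-square difference between predicted and reference values."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((p - r) ** 2 for r, p in zip(reference, predicted))
    return math.sqrt(squared / len(reference))


# Hypothetical reference (wet chemistry) and cross-validated NIR values, in %
reference_values = [40.0, 42.0, 44.0]
predicted_values = [40.3, 41.6, 44.0]
print(round(secv(reference_values, predicted_values), 3))
```

The result carries the same units as the concentration, so an SECV of 0.26 for protein means a typical cross-validation error of about 0.26 percentage points.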



SEP

The standard error described here is the standard deviation of the
sample-mean estimate of a population mean.

It is usually estimated by the sample estimate of
the population standard deviation (sample
standard deviation) divided by the square
root of the sample size:

$$SE = \frac{s}{\sqrt{n}}$$

where

s is the sample standard deviation (i.e., the
sample-based estimate of the standard
deviation of the population), and

n is the size (number of observations) of the
sample.

In NIR calibration practice, the Standard Error of Prediction (SEP) is
commonly computed in the same way as the SECV, but from the residuals
of an independent validation set.



Illustration of the Interactive, Spline Baseline Correction
(BC)* of the FT-NIR Spectrum of
a Whole Soybean Seed, with Coat

* Note: Only the PE Spectrum program supports this interactive, spline-function,
Fx, baseline correction (BC), shown above in purple color.






True FT-NIR Absorption Spectrum of 
Soybeans with Coat 




Calibration 



• Generate or Select a suitable set of Standard Samples of 
Known composition 

• Obtain Raw FT-NIR data 

• Correct Data for Multiple Scattering 

• Use the Lambert-Beer Law in conjunction with iterated data
regression by PLS-1 or PLS-2; check the PLS-1 precision and
the correctness of the computation

• Examine the Calibration's Linear Correlations and 
Validation/ Composition Predictions 
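
The calibration steps above can be sketched end to end on synthetic data. The Python block below implements the textbook NIPALS form of PLS-1 (one constituent at a time) and applies it to a noise-free, three-component mixture with 21 standards, in the spirit of the simulation study described in these slides. It is an illustrative sketch, not the software used in the project: the Gaussian band shapes, the concentration ranges and all function names are assumptions.

```python
import math
import random


def center_columns(rows):
    """Subtract the column means; return centered rows and the means."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows], means


def pls1_fit(spectra, y, n_factors):
    """NIPALS PLS-1: returns (x_mean, y_mean, [(weights, loadings, q), ...])."""
    x, x_mean = center_columns(spectra)
    y_mean = sum(y) / len(y)
    resid = [v - y_mean for v in y]
    factors = []
    for _ in range(n_factors):
        # Weight vector: covariance of every wavelength with the y residual
        w = [sum(row[j] * r for row, r in zip(x, resid)) for j in range(len(x_mean))]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(a * b for a, b in zip(row, w)) for row in x]  # scores
        tt = sum(v * v for v in t)
        p = [sum(row[j] * ti for row, ti in zip(x, t)) / tt for j in range(len(w))]
        q = sum(r * ti for r, ti in zip(resid, t)) / tt
        # Deflate the spectra and the concentration residual
        x = [[a - ti * pj for a, pj in zip(row, p)] for row, ti in zip(x, t)]
        resid = [r - q * ti for r, ti in zip(resid, t)]
        factors.append((w, p, q))
    return x_mean, y_mean, factors


def pls1_predict(model, spectrum):
    """Predict the concentration for one spectrum by sequential deflation."""
    x_mean, y_mean, factors = model
    x = [v - m for v, m in zip(spectrum, x_mean)]
    y_hat = y_mean
    for w, p, q in factors:
        t = sum(a * b for a, b in zip(x, w))
        y_hat += q * t
        x = [a - t * pj for a, pj in zip(x, p)]
    return y_hat


# Synthetic, noise-free mixture spectra (hypothetical Gaussian bands)
random.seed(0)
points = range(40)
pure = [[math.exp(-((k - c) / width) ** 2) for k in points]
        for c, width in [(10, 4), (20, 6), (28, 5)]]
conc = [[random.uniform(0.0, 50.0) for _ in pure] for _ in range(21)]
spectra = [[sum(c * s[k] for c, s in zip(row, pure)) for k in points] for row in conc]
y = [row[0] for row in conc]  # calibrate for component C1

model = pls1_fit(spectra, y, n_factors=3)
predicted = [pls1_predict(model, s) for s in spectra]
max_error = max(abs(a - b) for a, b in zip(predicted, y))
print(max_error)
```

With three independent components and no noise, three factors should reproduce the reference concentrations to within rounding error. Real spectra need more factors (the bulk soybean calibrations in these slides use 6 to 15), chosen by cross validation.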



Computer Simulation Study of the PLS-1
Calibration Algorithm

N = 21 Calibration Standards



Source: T. You, 2006 



Computer Simulation of PLS-1 Algorithm 
with 3 Components (%) 




Loading Vectors and SECVs of Components C1 to C3 for the
Simulated PLS-1 Calibration Algorithm, without any Noise
(except for negligible PC computation errors)

R: Correlation coefficient

SEP: Standard Error of Prediction

* Ideal conditions, that is, without noise.



Predicted vs. Reference Concentrations of 
Component C1 for our PLS-1 Simulation 
Study 
REFERENCE: 'Actual' Value



Source: T. You, 2006 



II. Limitations and Advantages of 
Proposed Procedures 



Major Limitations 

• Amino acids affected by acid hydrolysis of the
protein:

• Absence of Trp data. Trp is just one example
of what hydrolysis does to amino acids; the
need to correct for hydrolysis losses is
itself a limitation

• No data extrapolation to zero

• Correlation of amino acids with Dry Protein

• An extremely limited range of values (ROV) makes it
impossible to do the calculations for those
amino acids



Matrix Effects 



• A major obstacle in comparing NIRS calibration data
for different types of samples with the
same chemical/biochemical composition but in different
form or phase is the so-called "matrix effect"

• This "matrix effect" depends on the state of the
molecules (solid or liquid phase) and also on
differences in texture/morphology, particle size
distribution and molecular alignment.

• Other related causes: internal gaps, different
interfaces, internal changes in refractive index
within the sample, and so on.



NMR Advantages 



• Trp data included 

• in situ data acquisition 



NIRS Calibrations for 
Selected AA Groups 

• It is very important that there are enough 
knowns in the protein spectra to solve for 
all of the amino acid residues selected 

• This research project will be using four 
groups of amino acid triplets such as: 

• Arg-Ala-Val

• Cys (and/or Met)-Lys-Ser



Derivatized HPLC Advantages



• Derivatization prior to acid hydrolysis followed 
by HPLC eliminates the limitations caused by 
the protein treatment with acid. 



Planned Work, p.2 



Planned work also includes: 

• Crystalline amino acids powders vs. amino acids in 
gels 

• Egg white vs. egg albumin data 

• The making of concentrated solutions, followed by 
partial drying resulting in an amorphous gel rather 
than crystalline powders. 



Timeline and Steps 



1. Select the best primary method for analyzing protein and
amino acids

• GCMS, NMR

- Both are usable and comparable

- NMR is superior

- No acid hydrolysis

- Readily available

2. Obtain dry protein and moisture contents

3. Obtain PLS-1 for protein content



Model Systems Rationale 

• Why do we need a model system? Why 
can't we skip this step and just run our 
soybean samples? 

• We need knowledge of the bands assigned 
from the model systems (e.g., Tyrosine ring, 
Amide II band) 

Serves to show the scale of the level of 
errors we can expect to get 

Others have tried and failed 



Timeline and Steps, cont'd 



• In principle, all amino acid combinations 
should be performed, but we will do 4 triplets 

• TIME FACTOR: Running the samples 

• 4 x 100 samples x 2 (duplicates)

• = 800 samples 

• 5 minutes/sample.... 4,000 minutes 

• Or one month 

• TIME FACTOR: Making the samples 

• 1 week per sample, parallel proc. 

• Total estimated time: several months 



Experimental Setup 

• For the selection of the individual amino acids to be analyzed, 
the following matrix is measured for pure amino acid 
powders: 

• #1: Pro-Glu-Ala 

• #2: Arg-pSer-Ile 

• #3: Lys-Cys-Phe 



Computer Simulation of PLS-1 Algorithm 
with 3 Components (%) 



Arg 




Adapted from T. You, 2006 



Planned Work, cont'd 




Similar model calibrations will be carried out for mixtures of other selected
amino acid triplets, such as: (Cys + Met)-Arg-Ala or Met-Asn-Lys



Planned Work, cont'd 

• Future work may include a "triangle" triplet of 



• Pro-Glx-Val 

• Met+Cys-Arg-Ala 

• Soy protein isolates (SPI), egg albumin and lysozyme 

[Figure: ternary ("triangle") mixture diagram with vertices SPI, Lysozyme and Egg albumin; axis scale 0 to 100]



Moisture Calibration for Bulk Soybean Analysis on 
the Spectrum One NTS FT-NIR Instrument 




65 calibration standards, 20 grams for each standard, 9mm NIR beam size 

Source: Soybean NIR Database, UIUC 



Bulk Soybean Calibration with 65 Standards 
for Protein, Oil, and Moisture Analysis 

Developed with Data from the Spectrum One NTS FT-NIR Spectrometer 



Component   Number of Factors   R       SECV   SEP
Protein     13                  99.9%   0.26   0.33
Oil         15                  99.9%   0.13   0.23
Moisture    15                  99.9%   0.17   0.30

R: Correlation Coefficient

SECV: Standard Error of Cross Validation

SEP: Standard Error of Prediction

Source: Soybean NIR Database, UIUC



Detrimental Effects of Light Scattering on the
Accuracy of NIR Analysis for Whole Soybeans

(measured with the FT-NIR, SpectrumOne NTS Spectrometer)

Component   R (No MSC)   R (MSC)   SECV (No MSC)   SECV (MSC)
Protein     99.5         99.9      0.63            0.26
Oil         99.3         99.9      0.29            0.13
Moisture    99.8         99.9      0.26            0.17

R: Correlation coefficient

SECV: Standard Error of Cross Validation

(Tested with 65 bulk whole soybean standards)

Source: Soybean NIR Database, UIUC



A new and Improved Set of 124 Bulk Soybean 
Standard Samples: 

Protein-Oil Inverse Correlation for the Year 2003 Calibration Standard 



[Figure: scatter plot of the calibration standards; x-axis: Protein, % dry weight (35.0 to 60.0)]

* 124 standard samples were selected with a wide range of Protein and Oil
concentrations that were uniformly distributed in 0.5% concentration
steps for Protein, and in 0.2% steps for Oil.

Source: Soybean NIR Database, UIUC



NIR Illustration 




Golden Cove (Digital Color Infrared) 

Gold Beach, Oregon 

http://www.pbase.com/image/38188240



Chemical and Hyperspectral NIR
Imaging



Chemical and Hyperspectral Imaging 
of Amino Acid Residues and Proteins 

in Soybeans 



Images Using the NIR AutoImage
FT-NIR Microspectrometer:



Introduced in 2003 by 
PerkinElmer Co. (Shelton, CT, 
USA) for high-resolution studies. 

Employed for our NIR 
Microspectroscopy and Chemical 
Imaging investigations of 
Soybean seeds. 




Microscope Coupled to 
the FT-NIR Spectrometer 



Image of the Perkin Elmer AutoImage
Microspectrometer "Innards"




FT-IR Chemical Image (Left) and Visible Light 
Micrograph (Right) of a Black Coat Soybean with 
Part of the Coat Removed 




FT-NIR Chemical Image of Oil Distribution 
in a Mature Soybean Embryo Section 